Natural supports of information are given by random copolymers such as DNA or RNA where 
information is coded in the sequence of covalent bonds. At the molecular scale, the stochastic growth 
of a single copolymer with or without a template proceeds by successive random attachments or 
detachments of monomers continuously supplied by the surrounding solution. The thermodynamics 
of copolymerization shows that fundamental links already exist between information and thermodynamics 
at the molecular scale, which opens new perspectives for understanding the dynamical aspects 
of information in biology. 

Keywords: Thermodynamics of copolymerization; entropy production; stochastic processes; information 
theory; Shannon disorder; mutual information; mutations; DNA replication; DNA sequencing. 

I. INTRODUCTION 

Under nonequilibrium conditions, dynamical order already emerges at the molecular scale during 
copolymerization processes. Copolymers are special because they constitute the smallest physico-chemical supports of 
information. Little is known about the thermodynamics and kinetics of information processing in copolymerizations, 
although such reactions play an essential role in many complex systems, e.g. in biology. In this context, recent advances 
have shed new light on the nonequilibrium constraints required to generate information-rich 
copolymers [1-4]. 

Natural supports of information are given by random copolymers where information is coded in the sequence of 
covalent bonds, as already suggested by Schrödinger with his concept of the aperiodic crystal [5]. Random copolymers 
exist in chemical and biological systems. Examples are styrene-butadiene rubber, proteins, RNA, and DNA, the 
latter playing the role of information support in biology. 

At the molecular scale, the stochastic growth of a single copolymer proceeds by successive random attachments or 
detachments of monomers $\{m\}$ continuously supplied by the surrounding solution: 

$$m_1 m_2 \cdots m_{l-1} + m_l \rightleftharpoons m_1 m_2 \cdots m_{l-1} m_l \qquad (1)$$

The solution is assumed to be large enough to act as a reservoir in which the concentrations of monomers 
are kept constant. The stochastic growth of a single copolymer is accordingly modeled as a Markovian process with 
transition rates depending on the fixed concentrations of monomers in the surrounding solution [1-4]. 

According to local detailed balancing, the rates of forward and reverse transitions have ratios that are determined 
by the free energies of the copolymers $m_1 m_2 \cdots m_l$ in physical equilibrium with the surrounding solution. 
Thermodynamic quantities can thus be defined and their time evolution studied during the copolymerization process. In this 
way, fundamental relationships can be established between the thermodynamics of copolymerization processes and 
the information content encoded in growing copolymers [1-4]. The purpose of this communication is to present the 
latest results obtained in this framework. 
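
The local detailed balance condition invoked above can be written out explicitly. The following is only a sketch, in notation assumed here rather than taken from the text: $w_{+m_l}$ and $w_{-m_l}$ denote the attachment and detachment rates of the last monomer, $G$ is the free energy of a chain in equilibrium with the solution (so that it is assumed to include the contribution of the monomer taken from the reservoir), and $k_{\rm B}T$ is the thermal energy.

```latex
\frac{w_{+m_l}}{w_{-m_l}}
  = \exp\!\left[-\,\frac{G(m_1 m_2 \cdots m_{l-1} m_l) - G(m_1 m_2 \cdots m_{l-1})}{k_{\rm B} T}\right]
```

With this sign convention, attachment is more probable than detachment whenever it lowers the free energy.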



II. RESULTS 

As shown in Ref. [1] for copolymerization with or without a template, the thermodynamic entropy production is 
related not only to the average value of the free energy per monomer in the grown sequence, but also to the Shannon 
disorder of the sequence itself. This result is at the origin of the dissipation-error tradeoff during copolymer growth [6]. 

Two growth regimes are identified: 

(1) A regime close to thermodynamic equilibrium where the copolymer can grow in an adverse free-energy 
landscape by the entropic effect of its Shannon disorder. In this regime, the disorder of the grown sequence dominates 
the process even in the presence of a template, in which case the copying process generates many errors. 

(2) A regime farther away from equilibrium where the growth proceeds because the free energy of monomer attachment 
is favorable. In this regime, the error rate drops to low values. 
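
The two regimes can be read off from the entropy production once it is written per attached monomer. The relation below is a sketch in notation that is assumed here and not defined in the text: $v$ is the mean growth velocity, $\varepsilon$ the free-energy driving force per monomer in units of $k_{\rm B}T$ (taken positive when attachment is favorable), $D(\omega|\alpha)$ the Shannon disorder per monomer of the copy given the template, and entropy is counted in units of $k_{\rm B}$. Without a template, $D(\omega|\alpha)$ is replaced by the disorder $D(\omega)$ of the sequence itself.

```latex
\frac{1}{k_{\rm B}}\,\frac{d_{\rm i}S}{dt} = v\,A \ge 0 ,
\qquad
A = \varepsilon + D(\omega|\alpha)
```

Under these assumptions, growth ($v > 0$) requires $A > 0$. In regime (1), $\varepsilon$ may be negative provided the disorder compensates for it; in regime (2), $\varepsilon$ itself is positive.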

FIG. 1: Stochastic growth of the copolymer $\omega$ on the template $\alpha$, as simulated by Gillespie's algorithm with the parameter 
values $k_{\rm correct} = 1$, $k_{\rm error} = 0.5$, $k_- = 10^{-3}$, and $[2] = 1.3 \times 10^{-3}$. The template is generated by a Bernoulli process of 
probabilities $(\tfrac{1}{2}, \tfrac{1}{2})$. (a) Space-time plot at the concentration $[1] = 2 \times 10^{-3}$; (b) The mean growth velocity $v$, the Shannon 
disorder $D(\omega)$ of the copy $\omega$, the Shannon disorder $D(\omega|\alpha)$ of the copy $\omega$ conditioned on the template sequence $\alpha$, and the 
mutual information $I(\omega, \alpha)$ between the copy $\omega$ and the template $\alpha$ versus the concentration [1] of the monomers of species 1. 



In Refs. [1, 3], these regimes were studied as a function of the free-energy driving force. In Ref. [4], results have 
been reported on a model of free copolymerization where the attachment and detachment rates are controlled by the 
concentrations of monomers in the surrounding solution. In the present communication, this study of the dependence 
on monomeric concentrations is extended to a model of copolymerization with a template. The template as well as 
the growing copy are composed of two monomers, $m = 1$ and $m = 2$. The pairs 1-1 and 2-2 are favored between the 
template $\alpha$ and the copy $\omega$. The pairs 1-2 and 2-1 are considered as errors during information transmission from the 
template to the copy. The kinetic mechanism of elongation is the following: 

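$$\begin{array}{rl}
\alpha: & n_1 n_2 \cdots n_{l-1}\, n_l\, n_{l+1} \cdots \\
\omega: & m_1 m_2 \cdots m_{l-1} \; + \; m_l
\end{array}
\;\rightleftharpoons\;
\begin{array}{l}
n_1 n_2 \cdots n_{l-1}\, n_l\, n_{l+1} \cdots \\
m_1 m_2 \cdots m_{l-1}\, m_l
\end{array}
\qquad (2)$$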

The attachment rates are given by $w_{+m|n} = k_{+m|n}[m]$ and the detachment rates by $w_{-m|n} = k_{-m|n}$. The attachment 
rates are proportional to the monomeric concentrations $[m]$, while the detachment rates are not, since detachments 
do not require the presence of monomers in the surrounding solution. The rates are assumed to be independent of 
the end $m_{l-1}$ of the copy $\omega$, which is a simplifying assumption. The rate constant for the formation of correct pairs is defined as 
$k_{\rm correct} = k_{+1|1} = k_{+2|2}$ and that for errors as $k_{\rm error} = k_{+1|2} = k_{+2|1}$. All the detachment rates are assumed to take 
the same value: $k_- = k_{-1|1} = k_{-1|2} = k_{-2|1} = k_{-2|2}$. 
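
The rates above define a continuous-time Markov jump process, which can be sampled with Gillespie's algorithm. The following Python code is a minimal sketch under stated assumptions, not the implementation behind the published results: the function names `bernoulli_template` and `grow_copy`, the seeds, the stopping rule (the run ends when the copy reaches the end of the template or after `max_steps` events), and the convention that nothing can detach from an empty copy are all choices made here for illustration.

```python
import random


def bernoulli_template(length, p1=0.5, seed=0):
    """Random template of species 1 and 2; species 1 occurs with probability p1."""
    rng = random.Random(seed)
    return [1 if rng.random() < p1 else 2 for _ in range(length)]


def grow_copy(template, conc, k_correct=1.0, k_error=0.5, k_minus=1e-3,
              seed=1, max_steps=10**7):
    """Gillespie simulation of a copy growing along the template.

    conc maps each species (1 or 2) to its fixed concentration [m].
    Returns the copy and the elapsed time when the run stops.
    """
    rng = random.Random(seed)
    copy, t, steps = [], 0.0, 0
    while len(copy) < len(template) and steps < max_steps:
        steps += 1
        n = template[len(copy)]  # template monomer facing the next free site
        # Attachment rates w_{+m|n} = k_{+m|n} [m] for m = 1, 2.
        w1 = (k_correct if n == 1 else k_error) * conc[1]
        w2 = (k_correct if n == 2 else k_error) * conc[2]
        # Detachment rate w_{-m|n} = k_-; assumed zero for an empty copy.
        w_off = k_minus if copy else 0.0
        total = w1 + w2 + w_off
        t += rng.expovariate(total)  # exponential waiting time to the next event
        r = rng.random() * total     # select the event in proportion to its rate
        if r < w1:
            copy.append(1)
        elif r < w1 + w2 or not copy:
            copy.append(2)
        else:
            copy.pop()               # detachment of the last monomer
    return copy, t


if __name__ == "__main__":
    template = bernoulli_template(2000)
    copy, t = grow_copy(template, conc={1: 2e-3, 2: 1.3e-3})
    errors = sum(m != n for m, n in zip(copy, template))
    print(f"mean velocity ~ {len(copy) / t:.2e} monomers per unit time")
    print(f"error fraction ~ {errors / len(copy):.3f}")
```

Close to or below the equilibrium concentration the copy may never reach the end of the template, which is why the sketch caps the number of events; a fixed observation time would be an equally reasonable stopping rule.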

The stochastic process is simulated with Gillespie's algorithm. Figure 1(a) illustrates the fluctuating growth of a copy 
in space and time. In this example, the error rate is large because the ratio $k_{\rm correct}/k_{\rm error} = 2$ is small. This 
ratio is determined by the strength of the pairing bonds. Figure 1(b) depicts the mean growth velocity as well as the 
quantities characterizing the information content of the copy $\omega$ compared to the template $\alpha$ versus the concentration 
of monomers 1 in the surrounding solution. The mean growth velocity vanishes at equilibrium, which occurs at the 
concentration $[1]_{\rm eq} \simeq 0.3 \times 10^{-4}$ if $[2] = 1.3 \times 10^{-3}$. Both the Shannon disorder $D(\omega)$ of the copy and the conditional 
disorder $D(\omega|\alpha)$ of the copy with respect to the template are larger than the mutual information $I(\omega, \alpha)$ between the 
copy and the template. In any case, the three quantities are related to each other by the well-known formula 

$$I(\omega, \alpha) = D(\omega) - D(\omega|\alpha) \qquad (3)$$

from information theory [J [3] . The mutual information characterizes the fidelity of information transmission between 
the template and the copy. Figure 1(b) shows that this fidelity decreases close to equilibrium. The reason is that the 
out-of-equilibrium directionality is lost close to equilibrium where fluctuations go either forward or backward because 
the principle of detailed balancing prevails at equilibrium. Consequently, there is a multiplication of errors close to 
equilibrium. This error catastrophe is avoided by maintaining the system far enough from equilibrium. 
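
The relation between the mutual information, the Shannon disorder, and the conditional disorder can be illustrated on a pair of aligned sequences. The Python sketch below uses single-site plug-in estimates in nats, which equal the per-monomer quantities only if the pairs at different positions are statistically independent; that independence, the use of natural logarithms, and the function names are assumptions made here, not statements from the text.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Entropy in nats of the empirical distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())


def single_site_information(template, copy):
    """Return (D(copy), D(copy|template), I(copy, template)) per monomer."""
    pairs = list(zip(template, copy))
    h_copy = shannon_entropy(Counter(m for _, m in pairs))       # D(omega)
    h_template = shannon_entropy(Counter(n for n, _ in pairs))
    h_joint = shannon_entropy(Counter(pairs))
    h_cond = h_joint - h_template                                # D(omega|alpha)
    return h_copy, h_cond, h_copy - h_cond                       # I = D - D_cond


if __name__ == "__main__":
    template = [1, 2, 1, 2, 1, 2, 1, 2]
    print(single_site_information(template, list(template)))    # perfect copy
    print(single_site_information([1, 1, 2, 2], [1, 2, 1, 2]))  # unrelated copy
```

For a perfect copy the conditional disorder vanishes and the mutual information equals the disorder of the copy, whereas for a copy that is statistically unrelated to the template the mutual information vanishes.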

III. CONCLUSIONS AND PERSPECTIVES 



The results show that fidelity in copying a copolymer requires the supply of enough free energy from the attachment 
of monomers. In this respect, the nonequilibrium driving should exceed a critical value in order to transmit information 
in copolymerization processes with a template, such as DNA replication [1]. The statement by Manfred Eigen that 
"information cannot originate in a system that is at equilibrium" [10] is rigorously proved in the present framework. 
The thermodynamics of copolymerization thus shows that fundamental links already exist at the molecular scale 
between information and thermodynamics [1-4]. 

The transition between the two growth regimes could be experimentally investigated in chemical or biological 
copolymerizations. In polymer science, methods for synthesizing and sequencing copolymers for the information 
they may carry are not yet well developed. However, such methods are already well developed 
for DNA and under development for single-molecule DNA or RNA sequencing [7-9]. These methods could be used 
to test experimentally the predictions of copolymerization thermodynamics by varying NTP and pyrophosphate 
concentrations to approach the regime near equilibrium where the mutation rate increases. 

These considerations open new perspectives for understanding the dynamical aspects of information in biology. During 
copolymerization processes with a template (as is the case for replication, transcription or translation in biological 
systems), information is transmitted although errors may occur due to molecular fluctuations, which are sources of 
mutations. The two main features of biological systems - namely, metabolism and self-reproduction - turn out to be 
related in a fundamental way since information processing is constrained by energy dissipation during copolymerizations. 
Moreover, the error threshold for the emergence of quasi-species in the hypercycle theory by Eigen and Schuster 
[11] could be induced at the molecular scale by the transition towards high-fidelity replication beyond the transition 
between the two growth regimes [12]. In this way, prebiotic chemistry could be more closely linked to the first steps 
of biological evolution. 